Here is a simple language that we can use to understand the difference between
parsing an operator as left or right associative.

    L = { a, a+a, a+a+a, a+a+a+a, ... }


-----------------------------------------------------------------------------


1.) Here is an EBNF grammar for language L.


EBNF:    expr -> a ( '+' a )*


Using this grammar, how might you parse the string "a+a+a+a"?
The grammar generates the string, but it doesn't seem to tell you how to group it.


-----------------------------------------------------------------------------


2.) Here is an ambiguous BNF grammar for language L.


BNF:    expr -> expr '+' expr | a


Using this grammar you can parse the string "a+a+a+a" five different ways.
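
One way to see the ambiguity is to list every grouping that the grammar
allows. Here is a minimal sketch that does this. The class name
AmbiguousParses and the method groupings() are made up for this
illustration, and each parse tree is written as a fully parenthesized string.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class (not part of the original notes).
class AmbiguousParses {

    // Returns every fully parenthesized grouping of n operands "a"
    // that the grammar  expr -> expr '+' expr | a  allows.
    static List<String> groupings(int n) {
        List<String> result = new ArrayList<>();
        if (n == 1) {
            result.add("a");              // production: expr -> a
            return result;
        }
        // production: expr -> expr '+' expr
        // Give k operands to the left expr and n-k operands to the right expr.
        for (int k = 1; k < n; k++) {
            for (String left : groupings(k)) {
                for (String right : groupings(n - k)) {
                    result.add("(" + left + "+" + right + ")");
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints one grouping of "a+a+a+a" per line.
        for (String g : groupings(4)) {
            System.out.println(g);
        }
    }
}
```

Run main and count the lines that it prints. Each line is a different
parse tree for the same string "a+a+a+a".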


-----------------------------------------------------------------------------


3.) Here is an unambiguous, "right associative", BNF grammar for language L
(and two EBNF grammars derived from it).


BNF:    expr -> a '+' expr | a    // notice that this is right recursive


EBNF:   expr -> a [ '+' expr ]    // notice that this is still right recursive

EBNF:   expr -> a ( '+' a )*      // we removed (or "eliminated") the recursion


Using the BNF grammar, or the first EBNF grammar, we parse the
string "a+a+a+a" as if it were "grouped" as a+(a+(a+a)).


Let us look at how the right recursion can be removed from the BNF grammar.

Here is a derivation from the BNF grammar, shown as a sequence of sentential forms.

        a + expr
        a + a + expr
        a + a + a + expr
        a + a + a + a + expr
        a + a + a + a + a

Notice how the "expression part" is moving to the right, and in each step
it grows the expression by concatenating the string "+ a" onto what had
been to its left. So that means we can think of growing a string in the
language by starting with the string "a" and then concatenating on the right
as many "+ a" strings as we like. Hence, the description
       expr -> a ( '+' a )*
which uses iteration (the Kleene star) in place of recursion.

-----------------------------------------------------------------------------


4.) Here is an unambiguous, "left associative", BNF grammar for language L
(and three EBNF grammars derived from it).


BNF:    expr ->  expr '+' a | a   // notice that this is left recursive


EBNF:   expr -> [ expr '+' ] a    // notice that this is still left recursive

EBNF:   expr -> ( a '+' )* a      // we removed (or "eliminated") the recursion

EBNF:   expr -> a ( '+' a )*

Using the BNF grammar, or the first EBNF grammar, we parse the
string "a+a+a+a" as if it were "grouped" as ((a+a)+a)+a.

Notice that the grammars that use the Kleene star don't really tell
us how we should parse a string (we will see below why not).


Notice, from the last two BNF grammars, that, in essence, a left-associative
operator means we have a left-recursive grammar production that captures the
"big stuff" on the left side of the operator and it captures a "little thing"
on the right side of the operator, e.g.,
       expr -> expr '+' a
and a right-associative operator means we have a right-recursive grammar
production that captures "big stuff" on the right side of the operator
and a "little thing" on the left side of the operator, e.g.,
       expr -> a '+' expr



Let us look at how the left recursion can be removed from the BNF grammar.

Here is a derivation from the BNF grammar, shown as a sequence of sentential forms.

                 expr + a
             expr + a + a
         expr + a + a + a
     expr + a + a + a + a
        a + a + a + a + a

Notice how the "expression part" is moving to the left, and in each step
it grows the expression by concatenating the string "a +" onto what had
been to its right. So that means we can think of growing a string in the
language by starting with the string "a" and then concatenating on the left
as many "a +" strings as we like. Hence, the description
       expr -> ( a '+' )* a
which uses iteration (the Kleene star) in place of recursion.

But we can just as reasonably look at the final string
      a + a + a + a + a
and say that we grow this string by starting with the string "a" and then
concatenating on the right(!) as many "+ a" strings as we like. Hence, we
can also get this description of the language.
       expr -> a ( '+' a )*
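
The two ways of growing the string can be checked with a short sketch.
The class name KleeneStarForms and the methods growRight() and growLeft()
are hypothetical names chosen for this illustration. One method appends
"+a" on the right, the other prepends "a+" on the left.

```java
// Hypothetical helper class (not part of the original notes).
class KleeneStarForms {

    // expr -> a ( '+' a )*   : start with "a", append "+a" on the right k times.
    static String growRight(int k) {
        StringBuilder sb = new StringBuilder("a");
        for (int i = 0; i < k; i++) {
            sb.append("+a");
        }
        return sb.toString();
    }

    // expr -> ( a '+' )* a   : start with "a", prepend "a+" on the left k times.
    static String growLeft(int k) {
        StringBuilder sb = new StringBuilder("a");
        for (int i = 0; i < k; i++) {
            sb.insert(0, "a+");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // For each k, print both strings and whether they are equal.
        for (int k = 0; k <= 4; k++) {
            String r = growRight(k);
            String l = growLeft(k);
            System.out.println(r + "   " + l + "   " + r.equals(l));
        }
    }
}
```

Both methods produce the same string for every k, so the two descriptions
give the same language. Neither one records how the string was grouped.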

So the left recursion in this BNF
       expr ->  expr '+' a | a
can be factored out using either this EBNF
       expr -> ( a '+' )* a
or this EBNF
       expr -> a ( '+' a )*
The second form of EBNF is preferable since it is easy to translate it
into a while-loop that parses the language. But the second EBNF is also the
EBNF that we derived from the right-recursive grammar! Since right-recursion
gives us right-associativity, and left-recursion gives us left-associativity,
and the last EBNF can be derived from either the right or left recursive BNFs,
how can the EBNF grammar determine associativity? Well, it doesn't. The
associativity of the operator will be determined by how we write the parser,
not by how we wrote the (EBNF) grammar. Let's look at an example.

Here is a (non-recursive) piece of code that parses the grammar

    expr ->  expr '+' a | a

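void getExpr(Tokenizer tokens)   // Tokenizer: any token stream with hasToken() and match()
{
   tokens.match("a");            // every expr starts with an "a"
   while ( tokens.hasToken() )   // iteration instead of recursion
   {
       tokens.match("+");        // each pass through the loop consumes one "+ a"
       tokens.match("a");
   }
}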

This is a recognizing parser. It doesn't do anything but parse (and
throw an exception if a stream of tokens doesn't parse).

Here is how we can modify this parser so that it builds a parse tree
(or an abstract syntax tree).

Tree getExpr(Tokenizer tokens)
{
   tokens.match("a");
   Tree currentTree = new Tree("a");

   while ( tokens.hasToken() )
   {
       tokens.match("+");
       tokens.match("a");
       // the tree built so far becomes the LEFT operand of the new "+"
       currentTree = new Tree("+", currentTree, new Tree("a"));   // left associative
   }
   return currentTree;
}

Follow this code as it parses the string "a+a+a+a" (IMPORTANT: really
do follow this code as it parses the string). It parses the string
into a left-associative parse tree.
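
If you want to check your trace, here is a minimal runnable sketch of the
left-associative parser. The classes Tokens and Tree are assumptions made
for this sketch (the notes do not define them). The Tree class prints
itself as a fully parenthesized string so the grouping is visible.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class (not part of the original notes).
class LeftAssocDemo {

    // Assumed token stream with the three methods used in these notes.
    static class Tokens {
        private final List<String> toks;
        private int pos = 0;
        Tokens(List<String> toks) { this.toks = toks; }
        boolean hasToken() { return pos < toks.size(); }
        String nextToken() {
            if (!hasToken()) throw new IllegalStateException("unexpected end of input");
            return toks.get(pos++);
        }
        void match(String expected) {
            String tk = nextToken();
            if (!tk.equals(expected)) {
                throw new IllegalStateException("expected " + expected + " but found " + tk);
            }
        }
    }

    // Assumed tree: either a leaf, or an operator with two subtrees.
    static class Tree {
        final String label;
        final Tree left, right;
        Tree(String label) { this(label, null, null); }
        Tree(String label, Tree left, Tree right) {
            this.label = label; this.left = left; this.right = right;
        }
        @Override public String toString() {
            return left == null ? label : "(" + left + label + right + ")";
        }
    }

    static Tree getExpr(Tokens tokens) {
        tokens.match("a");
        Tree currentTree = new Tree("a");
        while (tokens.hasToken()) {
            tokens.match("+");
            tokens.match("a");
            // The tree built so far becomes the left operand of the new "+".
            currentTree = new Tree("+", currentTree, new Tree("a"));
        }
        return currentTree;
    }

    // Convenience wrapper: parse a list of tokens and return the grouping.
    static String parse(String... toks) {
        return getExpr(new Tokens(Arrays.asList(toks))).toString();
    }

    public static void main(String[] args) {
        System.out.println(parse("a", "+", "a", "+", "a", "+", "a"));  // prints (((a+a)+a)+a)
    }
}
```

The tokens are given as a ready-made list because the notes do not
specify a tokenizer. Any tokenizer that produces the same stream would do.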

But now modify the code this way.

Tree getExpr(Tokenizer tokens)
{
   tokens.match("a");
   Tree currentTree = new Tree("a");

   while ( tokens.hasToken() )
   {
       tokens.match("+");
       tokens.match("a");
       // the tree built so far becomes the RIGHT operand of the new "+"
       currentTree = new Tree("+", new Tree("a"), currentTree);   // right associative?
   }
   return currentTree;
}


Again, follow this code as it parses the string "a+a+a+a" (IMPORTANT: really
do follow this code as it parses the string). Now it parses the string
into (what seems to be) a right-associative parse tree. But there's a problem.


Modify the parser once again (so that it can parse strings with variables
other than "a").

Tree getExpr(Tokenizer tokens)
{
   String tk = tokens.nextToken();   // any variable name, not just "a"
   Tree currentTree = new Tree(tk);

   while ( tokens.hasToken() )
   {
       tokens.match("+");
       tk = tokens.nextToken();
       currentTree = new Tree("+", new Tree(tk), currentTree);   // right associative?
   }
   return currentTree;
}

Now follow the above parser as it parses the string "a+b+c+d". You will
see that it is not really parsing the expression to be right-associative.
It's not even parsing the string correctly. The tree it builds is grouped
as d+(c+(b+a)), with the operands in reverse order.

But if you tokenize the string "a+b+c+d" from right-to-left, so the token
stream is
    ["d", "+", "c", "+", "b", "+", "a"]
and then you once again follow the parser as it parses this token stream,
then you should get a correct, right-associative, parse tree.
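
Here is a minimal sketch that runs the same loop on both token streams so
you can compare the results. The class name RightToLeftDemo and its
methods are hypothetical. To keep the sketch short, each tree is written
as a fully parenthesized string instead of a Tree object, and "tokenizing
from right to left" is simulated by reversing the list of tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class (not part of the original notes).
class RightToLeftDemo {

    // Same loop as the parser above. The string "(x+y)" stands for
    // the tree  new Tree("+", x, y).
    static String getExpr(Iterator<String> tokens) {
        String currentTree = tokens.next();
        while (tokens.hasNext()) {
            String plus = tokens.next();
            if (!plus.equals("+")) {
                throw new IllegalStateException("expected + but found " + plus);
            }
            String tk = tokens.next();
            // The new token goes on the left, the tree so far on the right.
            currentTree = "(" + tk + "+" + currentTree + ")";
        }
        return currentTree;
    }

    // Feed the tokens to the parser in their original order.
    static String parseLeftToRight(List<String> tokens) {
        return getExpr(tokens.iterator());
    }

    // Feed the tokens to the parser in reverse order.
    static String parseRightToLeft(List<String> tokens) {
        List<String> reversed = new ArrayList<>(tokens);
        Collections.reverse(reversed);
        return getExpr(reversed.iterator());
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("a", "+", "b", "+", "c", "+", "d");
        System.out.println(parseLeftToRight(tokens));  // prints (d+(c+(b+a)))
        System.out.println(parseRightToLeft(tokens));  // prints (a+(b+(c+d)))
    }
}
```

The first line of output has the operands in the wrong order. The second
line is the right-associative grouping of "a+b+c+d".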


The last several examples show that the EBNF grammar

       expr -> a ( '+' a )*

DOES NOT determine any associativity for the operator. It doesn't really tell
us how to parse. But we can use the grammar as a guide to implement parsers
for either a left-associative operator or a right-associative operator (but
the right-associative parser needs a right-to-left tokenizer!).

Of course, if we really want a right-associative operator, we should use the
right recursive grammar

       expr -> a '+' expr | a

and write a recursive descent parser for this grammar, and use a left-to-right
tokenizer!
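
Here is a minimal sketch of such a recursive descent parser. The class
name RightRecursiveParser and the helper parse() are hypothetical. As an
assumption for this sketch, any token other than "+" is accepted as an
operand (so it handles "a+b+c+d" as well as "a+a+a+a"), and the tree is
again written as a fully parenthesized string.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class (not part of the original notes).
class RightRecursiveParser {
    private final List<String> tokens;   // tokens in left-to-right order
    private int pos = 0;                 // index of the next unread token

    RightRecursiveParser(List<String> tokens) { this.tokens = tokens; }

    // expr -> operand '+' expr | operand
    String getExpr() {
        if (pos >= tokens.size() || tokens.get(pos).equals("+")) {
            throw new IllegalStateException("expected an operand at position " + pos);
        }
        String operand = tokens.get(pos++);      // the "little thing" on the left
        if (pos < tokens.size()) {
            if (!tokens.get(pos).equals("+")) {
                throw new IllegalStateException("expected + at position " + pos);
            }
            pos++;                               // consume the "+"
            String rest = getExpr();             // the "big stuff" on the right
            return "(" + operand + "+" + rest + ")";
        }
        return operand;                          // production: expr -> operand
    }

    // Convenience wrapper: parse a list of tokens and return the grouping.
    static String parse(String... toks) {
        return new RightRecursiveParser(Arrays.asList(toks)).getExpr();
    }

    public static void main(String[] args) {
        System.out.println(parse("a", "+", "b", "+", "c", "+", "d"));  // prints (a+(b+(c+d)))
    }
}
```

Notice that the recursive call sits exactly where the nonterminal expr
sits on the right side of the production. That is what makes the tree
nest to the right without reversing the token stream.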



Question: What does the following (recursive) EBNF give you?

EBNF:   expr -> ( expr '+' )* a